Friendlier workflow errors (consolidated) by pranaygp · Pull Request #1849 · vercel/workflow

pranaygp · 2026-04-24T02:08:54Z

Summary

Consolidates the eight-PR friendlier-errors stack into a single PR. Inspired by @Schniz's stalled #706. Superseded by this PR: #1831, #1832, #1836, #1837, #1838, #1839, #1840.

What's included

Area	Change
`@workflow/errors`	New `SerializationError`, `WorkflowBuildError` classes (with optional `hint` field). `Ansi` rendering helpers (`frame`, `code`, `docs`, `dim`, `inline`) — now lives under the `@workflow/errors/ansi` subpath so the main entry doesn't pull `chalk` into every consumer. `FatalError.is(err)` widened to recognize any error with a `fatal: true` own property.
`@workflow/core` — context violations	Four structured error classes (`NotInWorkflowContextError`, `NotInStepContextError`, `NotInWorkflowOrStepContextError`, `UnavailableInWorkflowContextError`) applied to twelve user-facing throw sites. Each includes a docs link. `.message` / `.stack` are plain text — the colored framed form renders lazily via `[util.inspect.custom]` / `toString()`, so structured logs and log drains no longer contain raw `\x1B[...m` bytes. All four classes set `fatal = true`, so `createHook()`-from-a-step fails immediately instead of burning three retry attempts. Thrown errors redirect their stack to the user's call site via a shared `redirectStackToCaller` helper so terminal overlays (Next.js, Turbopack, VS Code) point at user code.
`@workflow/core` — serialization	`SerializationError` applied to all user-facing serialization boundaries: stream locking, unregistered classes, missing `WORKFLOW_DESERIALIZE`, step-function / workflow-function misuse, and dehydrate/hydrate failures for workflow args, step args, and return values.
`@workflow/core` — runtime logger	Structured logger gains `.child()` and `.forRun(runId, workflowName)` for stable per-run context, standardized `[workflow-sdk]` prefix, error stacks surfaced in log drains, clarified replay-timeout phrasing (warn while retrying vs. error when giving up).
`@workflow/core` — attribution	`describeError(err)` and `describeRunError({ errorCode, errorName })` compute user-vs-SDK attribution + class-aware hints, either from a live `Error` instance or from persisted failure-event fields. Exposed under the public `@workflow/core/describe-error` subpath for CLI / web consumption. Terminal logs at step-failure, max-retries, run-failure, and fatal-setup sites now include `errorAttribution` metadata and hint text.
`@workflow/core` — consistency	Remaining internal invariants (missing `startedAt`, VM `crypto.subtle.generateKey`, closure-vars outside a step context, `ENOTSUP`) now throw `WorkflowRuntimeError` so they are attributed to the SDK. `defineHook().resume()` formats schema validation failures as a readable bulleted list instead of a raw JSON dump.
`@workflow/builders` — build-time	Failed esbuild phases, unresolved built-in steps, and empty esbuild output now throw `WorkflowBuildError` with a `hint` pointing at the likely fix.

Fixes from manual testing (`createHook()` inside a step)

Surfaced three issues after running the earlier stack end-to-end and addressed in the final commits:

ANSI bytes were leaking into .message because chalk resolves at construction time — fixed by storing plain text and rendering pretty lazily (see context-errors-plain-message changeset).
Context violations were retrying 3× and producing 4 duplicated log blocks — fixed by marking them fatal: true and widening FatalError.is() (see context-errors-fatal changeset).
Duplicated captureStackTrace feature-detect in two files — fixed by extracting a shared redirectStackToCaller helper (see capture-stack-shared changeset).

Follow-up round (latest commits)

Manual testing uncovered two more issues, fixed in the last two commits on this branch:

SerializationError still looped through 4 retries. Root cause: dehydrateStepReturnValue() was called outside the step-handler try/catch, so FatalError.is() never saw the error. Two-part fix: mark SerializationError as fatal = true, and move the dehydration call inside the user-code try/catch in step-handler.ts so the error routes through userCodeFailed → step_failed (see serialization-error-fatal changeset). Same non-POJO return now fails in ~1.6s / 1 block, not ~21s / 4 blocks.
Snapshot tests for log output. Added toMatchInlineSnapshot-backed tests for describeError() payloads (every attribution path) and the scoped-logger call signature for the two canonical runtime failure sites. Regression gate on the exact field shapes users see in their log drains.

Post-review polish (latest commits)

Driven by manual testing in tmux + reviewer feedback after the initial review pass:

Logger layout overhaul. Composed framing + structured fields (user error · FatalError, run wrun_…, step step_… · add (./workflows/x)) between the framing line and the stack body, instead of after — the most useful info now sits at the top of each block instead of being buried under 30+ stack lines.
Stack trim. composeLogLine() drops framework-internal frames (node_modules/.pnpm/, node:internal/, Turbopack-bundled node_modules__pnpm_* and _next_dist_* chunks) and caps the surviving frames at 6. Suppressed runs emit one summary line so users know the stack was trimmed.
Friendly step / workflow names in framings. step//./workflows/foo//bar renders as bar (./workflows/foo) everywhere we log a step or workflow name, via new formatStepName / formatWorkflowName helpers in @workflow/utils.
Consistent ╰▶ hint: / ╰▶ docs: framing across all errors with hints or docs slugs. WorkflowError, SerializationError, and WorkflowBuildError now share one appendFramedDetails helper, matching the box-drawing tree ContextViolationError already used. The blank-line-separated Learn more: <url> style is gone.
Single hint, lives on the error message. Dropped the duplicate hint field from runtime logger metadata. Hints get serialized into the event log on the message itself, rehydrate on the workflow side, and surface in observability. The previous logger-only hint duplicated stderr but never made it past the step boundary.
Better serialization-failure hint. Updated to point at the foundations doc instead of a hardcoded (plain objects, arrays, primitives, …) list, which drifted out of sync as the supported types grew. Reuses across step args, workflow args/return, stream messages, and any other site that goes through formatSerializationError.
Workbench route no longer double-logs failures. app/api/workflows/start/route.ts's SSE wrapper was catching WorkflowRunFailedError rejection and re-emitting it via console.error('Error in workflow stream:', error) + controller.error(error), which then triggered Next.js's ⨯ failed to pipe response overlay. The SDK already logs the failure cleanly, so the wrapper now closes the stream gracefully on WorkflowRunFailedError.
ErrorStackBlock title trim in web observability. Multi-line error messages (Failed to serialize step return value\n╰▶ hint: …) were rendering the entire framed body in the card title, pushing the copy button off-screen. Title now shows just the first non-empty trimmed line with single-line truncation; full message stays in the body via Error.stack.
Persisted error message no longer embeds the machine step name. Step "step//./.../foo" failed after N retries: … and Step "step//.../foo" exceeded max retries (…) now read Step failed after N retries: … / Step exceeded max retries (…). Observability already attributes the event to a specific step via the UI tree.
Retry summary reads 4 attempts · 3 max retries instead of 3 retries (which was ambiguous next to "4 attempts").
WorkflowError no longer leaks cause: undefined as an enumerable own property when no cause is provided — every subclass's util.inspect(err) output stays clean.
ContextViolationError [util.inspect.custom] counted message lines correctly when slicing the stack tail, fixing duplicated ╰▶ docs: lines on multi-detail errors.
Step-handler integration tests for fatal vs non-fatal retry-loop wiring: a fatal: true error emits one step_failed and zero step_retrying; a non-fatal Error retries via step_retrying until the budget is exhausted, then emits step_failed. Catches the silent-regression case where fatal = true is removed from a class but FatalError.is() unit tests stay green.

Addressed review feedback

VaguelySerious (Friendlier context-violation errors + Ansi renderer #1831): "I don't like errors package depending on chalk" — Ansi is now on a subpath; the main entry has no chalk dep path.
Copilot (Friendlier workflow errors (consolidated) #1849): "Duplicated Error.captureStackTrace guard/cast" — extracted to packages/core/src/capture-stack.ts.
Copilot (Friendlier context-violation errors + Ansi renderer #1831): Ansi subpath export concern — same subpath fix covers this.
VaguelySerious (Friendlier workflow errors (consolidated) #1849, blocking): "[util.inspect.custom] duplicates every detail line" — fixed by counting message lines instead of slicing only the first.
VaguelySerious (Friendlier workflow errors (consolidated) #1849): "cause: undefined enumerable own property" — wrapped the assignment in if (options?.cause !== undefined).
VaguelySerious (Friendlier workflow errors (consolidated) #1849): "Add an integration test on the retry loop" — added a fatal-vs-retryable describe block that asserts step_failed/step_retrying event counts directly.
VaguelySerious (Friendlier workflow errors (consolidated) #1849): "Drastically shorten the changesets" — consolidated 15 file-by-file entries into a single friendlier-errors.md covering core/errors/builders/utils.

Manual test plan

All sections below exercise different parts of the stack. Start cd workbench/nextjs-turbopack && pnpm dev unless otherwise noted.

1. Context-violation errors (phase 1 + 2 + followups)

A convenient smoke route at workbench/nextjs-turbopack/app/api/friendlier-errors-smoke/route.ts:

import { NextResponse } from 'next/server';
import { createHook, sleep, getStepMetadata, getWorkflowMetadata } from 'workflow';

export async function GET(req: Request) {
  const which = new URL(req.url).searchParams.get('which');
  try {
    if (which === 'createHook') createHook();
    if (which === 'sleep') await sleep('1s');
    if (which === 'getStepMetadata') getStepMetadata();
    if (which === 'getWorkflowMetadata') getWorkflowMetadata();
    return NextResponse.json({ ok: true });
  } catch (err) {
    console.error(err);
    return NextResponse.json(
      { name: (err as Error).name, message: (err as Error).message, stack: (err as Error).stack },
      { status: 500 }
    );
  }
}

createHook() outside workflow — ?which=createHook. Terminal shows a framed box with title `createHook()` can only be called inside a workflow function and a docs: https://…/workflow/create-hook line (polished — no longer note: Read more about…).
sleep() outside workflow — ?which=sleep. Same framing, docs URL ends in .../workflow/sleep.
getStepMetadata() in a workflow function (not a step) — add to a "use workflow" file; expect title "can only be called inside a step function".
getWorkflowMetadata() in application code — ?which=getWorkflowMetadata. Expect "workflow or step function".
resumeHook() inside a workflow — call resumeHook(token, payload) from inside a "use workflow" function. Title: `resumeHook()` cannot be called from a workflow context, plus a line this call was made from the workflow//./src/workflows/example.ts//myWorkflow workflow context. with the workflow/ prefix dimmed.
Stack points at user code — in the JSON response body, the first at ... line of err.stack should reference your route handler, not a frame inside @workflow/core. Same goes for the Next.js dev overlay.
No functionName leak — the JSON response body should NOT contain a functionName property on the error object (it used to via the old constructor param-property).
ANSI bytes don't leak into .message / .stack (new in this PR) — the JSON response body's message and stack strings must contain no \x1B[ bytes; they're plain text. In the terminal, console.error(err) still renders the pretty framed version via util.inspect.
Context violations fail fast (no 4× retries) (new in this PR) — call createHook() inside a "use step" function:
```
async function add(a: number, b: number): Promise<number> {
  'use step';
  createHook();
  return a + b;
}
```
Run a workflow that calls add(…). You should see one [workflow-sdk] fatal-error log block (Step "X" threw a FatalError — bubbling up...), not four. The step fails immediately.

2. Runtime logger metadata (phase 3)

Standardized prefix — all runtime log lines start with [workflow-sdk]. Grep your terminal output; there should be no [Workflows] or other prefixes.
Run context attached without repetition — trigger any workflow; log lines carry workflowRunId and workflowName as metadata fields instead of being baked into the message string.
Replay-timeout phrasing:
- While retries remain: warn level, phrasing includes "took too long — will retry".
- After final attempt: error level, phrasing includes "gave up".
Error stack surfaces in log drain — set up a log drain (or pipe to a file) and confirm the stack is included as a structured field (errorStack), not just the one-line error message.

3. `SerializationError` (phase 4)

Cause a user-facing serialization failure:

async function stepThatLeaks(): Promise<Map<string, { fn: () => void }>> {
  'use step';
  return new Map([['foo', { fn: () => console.log('unserializable') }]]);
}

Run the workflow. Expect [workflow-sdk] log with errorName: 'SerializationError'.
errorAttribution: 'user'.
Hint is rendered as a ╰▶ hint: line on the error message itself — the URL points at https://workflow-sdk.dev/docs/foundations/serialization (foundations doc, not a hardcoded type list). The hint travels with the persisted event into observability.
Stream locking: call getWritable('x').getWriter() twice on the same stream — expect a SerializationError with the same ╰▶ hint: framing.

4. `describeError` attribution (phase 5)

For each of these, inspect the [workflow-sdk] log at failure time:

Plain user Error — throw new Error('boom') from a step → errorAttribution: 'user', no framed hint on the message.
SerializationError → errorAttribution: 'user' + ╰▶ hint: line on the message pointing at the foundations doc.
Context-violation error → errorAttribution: 'user' + ╰▶ docs: line on the message pointing at the per-API reference page.
WorkflowRuntimeError — throw new WorkflowRuntimeError('invariant') from a step → errorAttribution: 'sdk'.
Replay timeout — set WORKFLOW_REPLAY_TIMEOUT_MS=50, run a non-trivial workflow → after retries exhaust, errorAttribution: 'sdk', errorCode: 'REPLAY_TIMEOUT'.
Max-delivery exhaustion — write a step that always throws; after queue max-delivery budget exhausts → errorAttribution: 'sdk', errorCode: 'MAX_DELIVERIES_EXCEEDED'.

Note: describeError(err) and describeRunError({errorCode, errorName}) still return a hint field for CLI / web consumers, but the runtime logger no longer adds it as a duplicate metadata field. Hints live on the error message instead, so they survive serialization and surface in observability.

5. Consistency pass (phase 6)

Zod schema failure on defineHook().resume() — call hook.resume(token, invalidBody). Expect a readable bulleted list of validation issues (one per line, at "field": message), not a raw JSON dump of ZodError.issues.
crypto.subtle.generateKey() inside workflow VM — call it from a "use workflow" function. Expect a clear message explaining why it's disabled + "move this into a step function", with errorAttribution: 'sdk'.

6. `describeError` subpath (phase 7 foundation)

Create scratch.ts at repo root:

import { describeRunError, describeError } from '@workflow/core/describe-error';
import { SerializationError, WorkflowRuntimeError } from '@workflow/errors';

console.log(describeRunError({ errorCode: 'USER_ERROR', errorName: 'SerializationError' }));
console.log(describeRunError({ errorCode: 'USER_ERROR', errorName: 'NotInWorkflowContextError' }));
console.log(describeRunError({ errorCode: 'RUNTIME_ERROR' }));
console.log(describeRunError({ errorCode: 'REPLAY_TIMEOUT' }));
console.log(describeRunError({ errorCode: 'MAX_DELIVERIES_EXCEEDED' }));
console.log(describeRunError({ errorCode: 'USER_ERROR' }));

console.log(describeError(new SerializationError('boom')));
console.log(describeError(new WorkflowRuntimeError('invariant')));
console.log(describeError(new Error('plain')));

Run pnpm tsx scratch.ts.

describeRunError({ errorCode: 'USER_ERROR', errorName: 'SerializationError' }) → { attribution: 'user', errorCode: 'USER_ERROR', hint: 'A value…serialized…' }.
describeRunError({ errorCode: 'RUNTIME_ERROR' }) → { attribution: 'sdk', hint: 'This is an internal workflow SDK error…' }.
Live-error parity — describeError(new SerializationError('x')) and describeRunError({ errorCode: 'USER_ERROR', errorName: 'SerializationError' }) return the same shape and hint string.
Subpath import works — TypeScript resolves @workflow/core/describe-error and pnpm tsx runs without module-resolution errors.

7. `WorkflowBuildError` (phase 8)

Exercise the build pipeline, not the runtime. Use workbench/nextjs-turbopack and run pnpm build (not pnpm dev).

Syntax error in a workflow file → in the build output, expect a WorkflowBuildError titled "Build failed during workflows bundle" followed by a blank line and hint: Review the esbuild errors above…. The original esbuild errors remain printed above (not suppressed).
Unresolved built-in steps — mv node_modules/workflow node_modules/workflow-bak and run pnpm build. Expect WorkflowBuildError: Failed to resolve built-in steps sources. + hint: run \pnpm install workflow`…`. Restore afterwards.
Empty workflow directory — move all workflow files aside. Expect WorkflowBuildError: No output files generated from esbuild + hint mentioning "use workflow" / "use step" directives.
.is() discriminator — in a scratch script: WorkflowBuildError.is(new WorkflowBuildError('x', { hint: 'y' })) returns true; WorkflowBuildError.is(new Error('x')) returns false.
Runtime paths unaffected — run a normal workflow at runtime (pnpm dev). Confirm no WorkflowBuildError shows up; this class is build-time only.

8. New `@workflow/errors/ansi` subpath (final PR)

import { Ansi } from '@workflow/errors' no longer works (and wasn't intended to — the helpers were always namespaced). Confirm import * as Ansi from '@workflow/errors/ansi' resolves.
A package that imports only error classes (import { SerializationError } from '@workflow/errors') no longer pulls chalk into its bundle. Check a production bundle or pnpm why chalk from a dependent context.

9. `FatalError.is()` widening (final PR)

FatalError.is(new FatalError('x')) — true.
FatalError.is(new NotInWorkflowContextError('createHook()', 'https://…')) — true (via fatal: true own property).
FatalError.is(new Error('x')) — false.
FatalError.is({ fatal: true }) — false (must be an Error-shaped value).

Unit tests

All packages typecheck clean; relevant test files pass:

pnpm --filter @workflow/errors test — 26 tests (Ansi, SerializationError, WorkflowBuildError, FatalError widening, framed-details consolidation)
pnpm --filter @workflow/core exec vitest run src/log-format.test.ts src/logger.test.ts src/context-errors.test.ts src/describe-error.test.ts src/runtime/step-handler.test.ts — 76 tests, including new composeLogLine snapshot tests and the fatal-vs-retryable step-handler integration suite
pnpm --filter @workflow/builders test — 129 tests
pnpm typecheck — clean across workspace

Smoke-test harness (local)

For reviewers reproducing the manual test plan, a one-shot tmux harness lives at /tmp/wf-1849-smoke.sh (built during testing). Pane layout:

┌─ dev server :3010 ──────────┬─ workflow web :3011 ──┐
│  pnpm dev (next.js + sdk)   │  pnpm workflow web    │
├──────────────────────────────┴───────────────────────┤
│  /tmp/wf-1849-smoke.sh ok | fatal | retryable |     │
│                       serialization | context-step  │
│                       ctx-create | ctx-sleep | ...  │
│                       all                            │
└──────────────────────────────────────────────────────┘

The harness's workbench/nextjs-turbopack/workflows/smoke.ts and app/api/friendlier-errors-smoke/route.ts are kept out of git (workbench-local) — set them up via the workflow-init-style instructions in the prior conversation if you want to reuse the harness.

🤖 Generated with Claude Code

Phase 1: Add Ansi rendering helpers (frame, hint, note, help, code, inline) to @workflow/errors, and a chalk mock for readable snapshot tests. Phase 2: Add four context-violation error classes to @workflow/core (NotInWorkflowContextError, NotInStepContextError, NotInWorkflowOrStepContextError, UnavailableInWorkflowContextError) and apply them to all twelve user-facing throw sites so errors now include docs links and a structured "what/why/fix" frame. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Tighten phase 1 changeset to a single sentence (per pranaygp review) and switch to double-quoted frontmatter (per Copilot + repo convention). - Implement `ansifyName` to actually apply dim styling to workflow/ / step/ prefixes; add an `Ansi.dim` helper to `@workflow/errors` so callers don't need to import chalk directly. - Remove the `void getWorkflowMetadata;` workaround in context-errors.ts by dropping the unused value import (we only needed the type and symbol). - Render the plain-Error throw in `workflow/get-workflow-metadata.ts` with `Ansi.frame` + docs link so the VM path matches the structured-class styling from the sibling step path (still uses a plain Error to avoid the module-init cycle). - Guard `buildUnderline` against zero-length markers so a stray empty token can't produce a negative `String.repeat` count.

Adds a `.child()` and `.forRun(runId, workflowName)` child-logger API to the structured logger so runtime/step code doesn't have to repeat `workflowRunId`/`workflowName`/`stepId` on every call. Normalizes error metadata to structured `errorName` / `errorMessage` / `errorStack` fields instead of ad-hoc `error: err.message` strings, and adds comments to silent catches that swallow expected idempotency conflicts. Also folds in the pending changes from #1812 so that PR can be closed: - Standardize the console prefix to `[workflow-sdk]`. - Split the replay-timeout log into a warn-while-retrying vs. error-when-giving-up, and surface the underlying error when we can't mark a timed-out run as failed. - Include the error stack in the "Fatal runtime error during workflow setup" log and in the top-level user-code workflow error log so the stack surfaces in flattened log drains. - Drop the `[Workflows] "<runId>" - ` prefix from `buildWorkflowSuspensionMessage` — the structured logger now attaches run context. Supersedes #1812. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Phase 4 of friendlier errors: introduce a `SerializationError` class with an optional `hint` and a docs link (workflow-sdk.dev/err/serialization-failed), and adopt it at every user-facing serialization boundary in @workflow/core: - Locked ReadableStream at a workflow boundary - Unregistered class / missing `classId` / missing `WORKFLOW_DESERIALIZE` - Attempting to return step functions to clients or call workflow functions directly - Webhook `respondWith()` called outside a step - `dehydrate*` / `getSerializeStream` failures (workflow args/return, step args/return, stream chunks) Internal invariants (format prefix length checks, unknown format bytes, missing `STREAM_NAME_SYMBOL`, encryption key/size guards, etc.) now throw `WorkflowRuntimeError` instead of plain `Error` so the classifier and logger treat them consistently. `formatSerializationError` now returns `{ message, hint }` so the hint fragment can be rendered with the standard SerializationError framing instead of being baked into the message string. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add describeError() that derives attribution and class-aware hints from existing error classes + RUN_ERROR_CODES — no event data changes. Wire into step failures, max-delivery exhaustion, run failures, and fatal setup errors so terminal logs include errorAttribution and a hint for known error types. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- `describeError(err, errorCode?)` now accepts an optional precomputed `RunErrorCode`. `classifyRunError(err)` only narrows to USER_ERROR / RUNTIME_ERROR, so the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED branches were previously unreachable from the step / run failure log sites. Callers that know the failure category (runtime.ts for replay timeout and max-deliveries exhaustion) now pass the code in. - Context-violation checks use `instanceof` against the actual classes from context-errors.ts instead of a name-string set. Type-safe + survives class renames. - Wire the new hints through to the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED log sites so those branches actually render a hint now. - 3 new tests cover the reachable code paths + precomputed-code override. - Changeset frontmatter switched to double quotes per repo convention.

Internal invariants now use WorkflowRuntimeError so describeError attributes them to the SDK: missing startedAt, VM generateKey, closure-vars outside step context, ENOTSUP. defineHook().resume() formats schema validation failures as a readable list instead of a JSON blob. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Observability renderers read persisted run_failed / step_failed event data, not live Error instances. describeRunError takes { errorCode, errorName } and returns the same { attribution, hint } shape as describeError, so the CLI and web UI can derive user-vs-SDK framing from the event log directly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add `WorkflowBuildError` class in `@workflow/errors` with optional `hint` for an actionable next step, and apply it in `@workflow/builders` at user-facing sites: failed esbuild phases, unresolved built-in steps, and empty esbuild output now throw `WorkflowBuildError` with a hint pointing at the likely fix. Runtime invariants remain plain `Error`.

…docs link, redirect stack - Drop the readonly `functionName` param-property on context-error classes so util.inspect no longer prints a trailing `{ functionName: 'foo()' }` block. - Replace the `DocLink` ("label: https://…") shape with a plain `DocsUrl` template-literal type. Error output now renders a single clean line: `docs: https://…` (new `Ansi.docs` helper) instead of the noisier "note: Read more about foo(): https://…". - Add throw helpers (`throwNotInWorkflowContext`, etc.) that call `Error.captureStackTrace(err, stackStartFn)` on V8 engines so the top frame of the thrown error points at the user's call site instead of at the gate function inside the framework. Callers pass themselves as the boundary. - Refactor `defineHook()` (both root and `/workflow`) to use named function closures rather than `this.create`/`this.resume`, since the stack redirect relies on a stable function identity that survives destructuring. - Update context-errors.test.ts to snapshot the new `docs:` framing and to add a regression test asserting the top stack frame is the user call site.

changeset-bot · 2026-04-24T02:08:58Z

🦋 Changeset detected

Latest commit: 5b8d791

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 23 packages

Name	Type
@workflow/core	Patch
@workflow/errors	Patch
@workflow/builders	Patch
@workflow/utils	Patch
@workflow/cli	Patch
@workflow/next	Patch
@workflow/nitro	Patch
@workflow/vitest	Patch
@workflow/web-shared	Patch
@workflow/web	Patch
workflow	Patch
@workflow/world-testing	Patch
tarballs	Patch
@workflow/world-local	Patch
@workflow/world-postgres	Patch
@workflow/world-vercel	Patch
@workflow/astro	Patch
@workflow/nest	Patch
@workflow/rollup	Patch
@workflow/sveltekit	Patch
@workflow/vite	Patch
@workflow/nuxt	Patch
@workflow/ai	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

vercel · 2026-04-24T02:09:00Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
example-nextjs-workflow-turbopack	Ready	Preview, Comment	May 4, 2026 4:20am
example-nextjs-workflow-webpack	Ready	Preview, Comment	May 4, 2026 4:20am
example-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-astro-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-express-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-fastify-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-hono-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-nitro-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-nuxt-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-sveltekit-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-tanstack-start-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workbench-vite-workflow	Ready	Preview, Comment	May 4, 2026 4:20am
workflow-docs	Ready	Preview, Comment, Open in v0	May 4, 2026 4:20am
workflow-swc-playground	Ready	Preview, Comment	May 4, 2026 4:20am
workflow-tarballs	Ready	Preview, Comment	May 4, 2026 4:20am
workflow-web	Ready	Preview, Comment	May 4, 2026 4:20am

github-actions · 2026-04-24T02:09:04Z

🧪 E2E Test Results

✅ All tests passed

Summary

	Passed	Skipped	Total
✅ ▲ Vercel Production	1011	67	1078
✅ 💻 Local Development	1090	86	1176
✅ 📦 Local Production	1090	86	1176
✅ 🐘 Local Postgres	1090	86	1176
✅ 🪟 Windows	98	0	98
✅ 📋 Other	552	36	588
Total	4931	361	5292

Details by Category

✅ ▲ Vercel Production

App	Passed	Skipped
✅ astro	91	7
✅ example	91	7
✅ express	91	7
✅ fastify	91	7
✅ hono	91	7
✅ nextjs-turbopack	96	2
✅ nextjs-webpack	96	2
✅ nitro	91	7
✅ nuxt	91	7
✅ sveltekit	91	7
✅ vite	91	7

✅ 💻 Local Development

App	Passed	Skipped
✅ astro-stable	92	6
✅ express-stable	92	6
✅ fastify-stable	92	6
✅ hono-stable	92	6
✅ nextjs-turbopack-canary	79	19
✅ nextjs-turbopack-stable	98	0
✅ nextjs-webpack-canary	79	19
✅ nextjs-webpack-stable	98	0
✅ nitro-stable	92	6
✅ nuxt-stable	92	6
✅ sveltekit-stable	92	6
✅ vite-stable	92	6

✅ 📦 Local Production

App	Passed	Skipped
✅ astro-stable	92	6
✅ express-stable	92	6
✅ fastify-stable	92	6
✅ hono-stable	92	6
✅ nextjs-turbopack-canary	79	19
✅ nextjs-turbopack-stable	98	0
✅ nextjs-webpack-canary	79	19
✅ nextjs-webpack-stable	98	0
✅ nitro-stable	92	6
✅ nuxt-stable	92	6
✅ sveltekit-stable	92	6
✅ vite-stable	92	6

✅ 🐘 Local Postgres

App	Passed	Skipped
✅ astro-stable	92	6
✅ express-stable	92	6
✅ fastify-stable	92	6
✅ hono-stable	92	6
✅ nextjs-turbopack-canary	79	19
✅ nextjs-turbopack-stable	98	0
✅ nextjs-webpack-canary	79	19
✅ nextjs-webpack-stable	98	0
✅ nitro-stable	92	6
✅ nuxt-stable	92	6
✅ sveltekit-stable	92	6
✅ vite-stable	92	6

✅ 🪟 Windows

App	Passed	Failed	Skipped
✅ nextjs-turbopack	98	0	0

✅ 📋 Other

App	Passed	Skipped
✅ e2e-local-dev-nest-stable	92	6
✅ e2e-local-dev-tanstack-start-stable	92	6
✅ e2e-local-postgres-nest-stable	92	6
✅ e2e-local-postgres-tanstack-start-stable	92	6
✅ e2e-local-prod-nest-stable	92	6
✅ e2e-local-prod-tanstack-start-stable	92	6

📋 View full workflow run

Replace util.inspect's default object dump (which quote-escapes multi-line stacks and paragraph hints into a single-line JSON-y blob) with a workflow-aware formatter that composes the entire log line into a single string passed to console.error / console.warn. Highlights of the new output: - Per-run / per-step IDs render with their parsed friendly names so users see `wrun_… · simple (./workflows/1_simple)` instead of just the raw `workflowName: 'workflow//./workflows/1_simple//simple'`. - Color-coded attribution badge (user error red / sdk error magenta) paired with the error class in bold. - Hints render as a paragraph under `hint:` rather than a backslash- `\n`-escaped string. - Drops redundant fields (errorStack always; errorMessage when it's already in the parent message) to avoid double-printing. - Unknown fields fall through as a sorted `key value` tail so we never silently drop log information. @workflow/errors/ansi gains bold/red/magenta helpers used by the formatter. The web / web-shared packages don't consume stderr — they read structured event payloads from the World event log — so this is presentation-only at the runtime layer.

Resolve conflicts: - packages/builders/src/base-builder.ts — keep WorkflowBuildError import alongside main's removal of unused usesVercelWorld. - packages/core/src/serialization.ts — main extracted serialization into modular ./serialization/{client,step,workflow}.ts and reformatted the error wrapper. Re-apply the SerializationError adoption on top of the new structure: - ./serialization/errors.ts now returns { message, hint } so callers can pass the hint through SerializationError instead of baking it into the message. - The 3 modular serializers (client/step/workflow) and 4 dehydrate/ hydrate wrappers in serialization.ts now throw SerializationError with a hint, unwrapping any inner SerializationError cause to keep the more specific outer context label ("workflow arguments" vs generic "workflow value"). - User-facing throws (ReadableStream is locked / step function not found / workflow functions cannot be called directly / respondWith outside step / step functions cannot be deserialized in client context) are now SerializationError with hints. - Internal invariants (stream-name validation, missing STREAM_NAME_ SYMBOL, closure-vars outside step context) are now WorkflowRuntime- Error so they get classified consistently.

TooTallNate

Read through the consolidated stack carefully — context-violation classes, capture-stack helper, describe-error, the errors-package additions and ANSI subpath, the logger rewrite, and the step-handler dehydrate move. The overall direction is good and a lot of it is well-executed; the structured FramedContent / renderPlain / renderPretty split is the right architecture for "plain message + pretty terminal rendering" and the subpath split for chalk is correct. Finding most of the right tradeoffs.

I have one blocking item, plus a handful of follow-up concerns and observations to raise before merge.

Blocking

`pr-artifacts/` directory needs to be removed before merge

Six files still in pr-artifacts/ (01-... through 05-... plus README). PR description explicitly says "Remove before merge" — they shouldn't ship to git history. Easy git rm pr-artifacts/, but it's the kind of thing that gets forgotten in the merge rush.

Higher-risk concerns

`FatalError.is()` widening is a runtime-behavior change

The static is() predicate now returns true for any Error carrying a truthy fatal property — not just name === 'FatalError'. This is the documented intent and enables SerializationError, ContextViolationError, etc. to opt in without inheritance. But it changes retry semantics — anything FatalError.is() returns true for skips retry budget.

Risk: a user-defined error class that incidentally has fatal = true (someone's class CustomError { fatal = true }) will silently become non-retryable in workflow runs. For most users this is fine; for anyone who rolled their own "fatal" convention, this is a surprise.

Two requests:

The friendlier-errors-consistency changeset should be minor, not patch, for @workflow/errors. Adding new error classes is minor; widening a public is() predicate is at minimum minor. Verify.
Add a test that asserts a non-Error POJO with { fatal: true } does NOT pass FatalError.is() (the isError guard handles this, but lock it in).

Step-handler dehydrate move is correct but undertested

The "move dehydrate inside the user-code try/catch" change is the highest-leverage runtime-behavior change in this PR (along with the FatalError.is widening above). I traced it carefully and the new flow is correct: dehydrate failures now route through userCodeError → fatal branch → step_failed → no retry. The if (!userCodeFailed) guard is right; the new try/catch funnels into the same handler block. Confirmed end-to-end:

dehydrateStepReturnValue only ever throws SerializationError (own wrapping catch in serialization.ts)
SerializationError has readonly fatal = true
FatalError.is(SerializationError) is true via the new widening
Routes to fatal branch → step_failed event → no step_retrying

But step-handler.test.ts doesn't test this path at all. The test diff only updates mock shapes for the new forRun/child logger methods. The dehydrate mock at lines 118-127 is mockResolvedValue(...) — it never rejects. Add a test where it rejects with SerializationError and assert step_failed is emitted exactly once and step_retrying is never emitted. This is the regression-gate test the PR most needs and doesn't have.

There's also one subtle behavior change worth calling out: errors from trace() setup itself (OTEL config, etc.) used to escape to the queue (48 redeliveries). They now get caught by the new inner try/catch and treated as user-step failures (4 attempts, attributed to user). Probably fine — OTEL setup failures are deterministic — but worth a sentence in the changeset.

Other observations (non-blocking)

Minor design

hint is baked into .message in WorkflowBuildError and SerializationError (super(\${message}\n\n${options.hint}`)). Future renderers that want to format hints separately can't recover the bare message. Consider keeping super(message)plain and exposinghint` separately; let renderers compose. Not blocking, but locks in a format.
Box-drawing implemented twice: renderPlain in context-violation-error.ts re-implements the framing structure of Ansi.frame to avoid pulling chalk into core. Defensible for the dependency reason (and the comment is clear), but the two implementations can drift. Worth a cross-reference comment in both places.
Eight new throw helpers (throwNotInWorkflowContext etc.) for four classes. Could be one generic throwContextViolation(Class, fnName, docsUrl, stackStartFn). Current shape is more autocomplete-friendly; defensible either way.

Logger / log-format

ANSI bytes can leak into stderr drains. formatLogMetadata uses Ansi.bold / Ansi.red etc. which call chalk directly. Tests pass because vitest has no TTY (chalk level 0 → pass-through). But if anyone runs pnpm dev with FORCE_COLOR=1, or stderr is piped through a wrapper that is a TTY upstream of a structured drain, ANSI bytes land in the message field. The OTel span event path (addEvent({ message, ...merged }) in logger.ts:73) is clean — it uses the unformatted message. But the console path is environment-dependent. Suggest using a chalk.Instance({ level: 0 }) for log formatting, or stripping ANSI before stderr write. Not blocking, but worth knowing.
stepLogger.* callsites bypass forRun-based scoping. step-handler.ts has 6 callsites (lines 371, 591, 644, 711, 764, 778) that use the namespace-step stepLogger and manually re-pass { workflowRunId, workflowName, stepId, stepName } each time. The runtimeLogger.forRun(...) migration is great where it's done; these bypass it. Could be stepLogger.forRun(...).child({ stepId, stepName }).
Lowercase [workflow] prefix still exists in runtime/world.ts:90, next/src/loader.ts:648, next/src/builder-deferred.ts:153/174/1010. These are direct console.* calls (not logger), so technically out of scope for the prefix-standardization changeset, but worth noting if "all SDK output prefixed with [workflow-sdk]" is the goal.

Module-realm hazards

describe-error.ts mixes Class.is(err) (duck-typed; survives ESM/CJS dual-package hazards) with err instanceof WorkflowRuntimeError (nominal; doesn't survive). Recommend .is() everywhere for consistency in a multi-realm ecosystem.

Minor

inline in errors/src/ansi.ts (~100 lines for the tagged template that builds Rust-compiler-style underlines) is impressively complex. Worth checking it has snapshot tests for: no markers, single marker, multiple markers, marker at end of line, color-on vs color-off, multi-line input. Skimmed the test file; coverage is thinner than the implementation deserves.
The "PR description" mentions esbuild.context().rebuild() failures bypass the new WorkflowBuildError path. The most common build-error case (unresolved imports during HMR) won't get the friendly framing. Worth a follow-up issue.

Doc

Some doc gaps for users on whether error message strings are stable API. The new describeError function consumes err.name and class identity, which suggests messages aren't API. Worth adding a CONTRIBUTING note.

To summarize what I'd want before merge:

Remove pr-artifacts/.
Add the regression-gate test for dehydrateStepReturnValue rejecting in step-handler.test.ts.
Verify changeset bumps for the FatalError.is widening are correct (minor at least).
Add the isError guard test locking in that { fatal: true } POJOs are not classified as fatal.

Everything else is non-blocking and could be follow-ups.

VaguelySerious

AI review: blocking issues found

VaguelySerious · 2026-05-02T10:31:47Z

+    return tail
+      ? `${this.name}: ${pretty}\n${tail}`
+      : `${this.name}: ${pretty}`;
+  }


AI Review: Blocking

[util.inspect.custom] duplicates every detail line in the rendered output. Repro on the built dist:

import { inspect } from 'node:util'; import { NotInWorkflowContextError } from '@workflow/core/...'; console.log(inspect(new NotInWorkflowContextError('createHook()', 'https://example.com/docs')));

produces:

NotInWorkflowContextError: `createHook()` can only be called inside a workflow function ╰▶ docs: https://example.com/docs ╰▶ docs: https://example.com/docs ← duplicated at userCallSite (...)

For UnavailableInWorkflowContextError (3 detail lines, with active workflow context), all three render twice.

Root cause: .message is multi-line (title\n╰▶ docs: …), so V8's .stack is Name: messageLine1\nmessageLine2\nmessageLine3\n at …. split('\n').slice(1).join('\n') only strips line 1, gluing the remaining message lines onto the prepended pretty form.

Suggested fix:

const messageLines = this.message.split('\n').length; const tail = (this.stack ?? '').split('\n').slice(messageLines).join('\n');

This is the primary terminal display path users see (console.error(err), framework dev overlays, uncaught throws), so worth fixing before merge. The existing 'util.inspect(err) reveals the pretty framed form' test only asserts .toContain('╰▶') and won't catch this — recommend adding a not.toMatch(/╰▶ docs:.*╰▶ docs:/s) or a toMatchInlineSnapshot with masked stack frames.

Fixed in 9d45cdf — counted the actual message line count instead of slicing only line 1. Added a regression test in context-errors.test.ts that asserts ╰▶ docs: appears exactly once and that no detail line repeats.

VaguelySerious · 2026-05-02T10:31:47Z

+
+  constructor(message: string, options?: WorkflowBuildErrorOptions) {
+    const body = options?.hint ? `${message}\n\n${options.hint}` : message;
+    super(body, { cause: options?.cause });


AI Review: Note

This super(body, { cause: options?.cause }) call (and the equivalent in SerializationError below) hits WorkflowError's constructor, which unconditionally does this.cause = options?.cause. That makes cause: undefined an enumerable own property on every instance, so util.inspect(err) of a no-cause error renders { cause: undefined, ... }.

Not new to this PR — WorkflowError already has the issue — but the new error classes added here inherit it, so the leak surfaces on more classes post-merge.

If you'd rather not address it in this PR, fine — but a one-liner if (options?.cause !== undefined) this.cause = options.cause; in WorkflowError's constructor would clean up the inspect output for all subclasses.

Fixed in 9d45cdf — wrapped the assignment in if (options?.cause !== undefined). super(message, { cause }) already sets a non-enumerable .cause when provided, so the only thing my code needed to drop was the unconditional re-assignment that turned it into an enumerable own property. Cleans up util.inspect(err) output for every WorkflowError subclass. Changeset entry added at .changeset/workflow-error-cause-undefined.md.

VaguelySerious · 2026-05-02T10:31:47Z

+        'wrun_test123',
+        expect.any(String),
+        expect.objectContaining({ stepId: 'step_abc' })
+      );


AI Review: Note

The most user-visible behavior change in this PR — "fatal errors fail in 1 attempt, not 4" — has unit coverage on FatalError.is(ContextViolationError) === true and SerializationError.fatal === true, but no end-to-end test on the step handler itself asserting that a fatal user error emits exactly one step_failed event (no retries) and a non-fatal Error does retry to maxRetries.

If someone removes fatal = true from ContextViolationError (or moves the dehydrate call back outside the user-code try/catch) later, the unit tests stay green and the regression returns silently. A small integration-style test exercising the retry loop with a fatal vs. non-fatal user error would be a cheap regression gate.

Done in 9d45cdf — added a step-handler fatal vs retryable behavior describe block in step-handler.test.ts that exercises the live retry-loop wiring. Three cases:

error with fatal: true → exactly one step_failed, zero step_retrying

non-fatal first-attempt Error → step_retrying once, no step_failed

non-fatal final attempt → step_failed once

Catches the case where fatal = true is removed from a context-violation class — the FatalError.is() unit tests would stay green but this test would flip from "1 step_failed" to "4 step_retrying then 1 step_failed."

VaguelySerious · 2026-05-02T10:31:47Z

+    message.includes(metadata.errorMessage as string)
+  ) {
+    redundant.add('errorMessage');
+  }


AI Review: Nit

message.includes(metadata.errorMessage) is brittle: it works for the current call sites that pass ${framing}\n${stack} (which embeds the full message), but a future caller that passes a truncated errorMessage field and a full message would fail to dedupe and double-print. Not actionable today, just flagging if snapshot tests start drifting later.

Acknowledged. Agree it's a heuristic that depends on current call sites. Leaving as-is for now since all callers compose the message via ${framing}\n${stack} (and the stack always begins with the error name + message), but if a future caller passes a truncated errorMessage we'd need to switch to a more explicit "already-rendered fields" set.

VaguelySerious · 2026-05-02T10:31:47Z

  },
  "dependencies": {
    "@workflow/utils": "workspace:*",
+    "chalk": "5.6.2",


AI Review: Nit

The PR description says the change "no longer pulls chalk into every consumer" — accurate at the bundle level (the main entry doesn't import chalk, so tree-shaken consumers don't pay), but chalk is still in dependencies so it lands in node_modules for every install. Minor framing thing — if you wanted to avoid the install footprint for consumers that never touch ./ansi, this would have to move to optionalDependencies or peerDependencies. Probably not worth it; just calling out the gap between the description and the install reality.

Good catch on the framing. Updating to "no longer pulls chalk into the bundle for consumers that don't import the ./ansi subpath." Moving chalk out of dependencies to peerDependencies / optionalDependencies is more disruptive (anyone importing ./ansi would have to install chalk explicitly), so leaving it as-is — the install footprint trade-off is fine for the SDK's audience.

…lier-errors-followups * origin-https/main: [vitest] [world-local] Fix local-world data recovery isolation (#1895) ci: switch Vercel deployment-protection bypass to OIDC Trusted Sources (#1882) Add additional tests for event consumer fixes for hook/sleep/step race conditions (#1528)

The job never runs `pnpm install` (it just calls `node` against a checked-in script), so the pnpm store path never exists. The post-job `actions/setup-node@v4` cache-save then fails with `Path Validation Error: Path(s) specified in the action for caching do(es) not exist` and red-X's the entire job even though the matrix step succeeded. The setup-workflow-dev composite already has a `cache-pnpm` opt-out input for this exact case — wire it through here.

- ContextViolationError: util.inspect(err) duplicated every framed detail line because the stack-tail strip only sliced the first message line. V8's Error.stack reads `Name: messageLine1\n messageLine2\n at ...`, so for our multi-line `title\n╰▶ docs: …` messages every detail line was getting prepended twice (once in the pretty form, once via the unsliced message tail). Count the actual message lines and slice past all of them. Repro test asserts `╰▶ docs:` appears exactly once. - WorkflowError: stop assigning `cause: undefined` as an enumerable own property when no cause is provided. Subclasses (every error in this PR) inherit the parent constructor; the unconditional assignment polluted `util.inspect(err)` output with `{ cause: undefined, … }` on every no-cause instance. The `super(...)` call already conditionally sets `.cause` non-enumerably when `options.cause` is provided. - step-handler.test.ts: add a regression-gate suite that exercises the fatal-vs-retryable retry-loop wiring directly. Asserts that an error with `fatal: true` produces exactly one `step_failed` event with no `step_retrying`, and that a non-fatal `Error` retries via `step_retrying` on early attempts and emits `step_failed` once the retry budget is exhausted. Catches the silent-regression case where `fatal = true` is removed from a context-violation error class but the `FatalError.is()` unit tests stay green.

VaguelySerious

Looks good enough for me to ship and test on 5.x.x

Can we drastically shorten the changesets? I think this should be a single changeset on core/error packages that explains the user-facing changes mostly, not a file-by-file breakdown of what's changed. Also let's remove PR artifacts. Then good to ship.

Address review feedback to drastically shorten the changesets — fold the 15 file-by-file entries into a single user-facing changeset for @workflow/core / errors / builders / utils. Also drop the pr-artifacts/ folder (reviewer-only log captures, no longer needed).

pranaygp · 2026-05-03T01:28:21Z

Done in 351971a:

Consolidated all 15 changesets into a single friendlier-errors entry covering @workflow/core / errors / builders / utils. User-facing summary, no file-by-file breakdown.
Removed pr-artifacts/.

Ready to ship.

Five user-driven fixes from manual smoke-testing of #1849: 1. Logger layout. composeLogLine() now puts the structured-fields block (attribution badge, run/step IDs, error code) **between** the framing line and the stack body, instead of after it where 30+ lines of stack buried the most useful information. The framing stays at the top, stack at the bottom, structured info readable at a glance. 2. Stack trim. Drops framework-internal frames (`node_modules/.pnpm/`, `node:internal/`, Turbopack-bundled `node_modules__pnpm_*` chunks, `_next_dist_*` chunks) and caps the surviving frame count at 6 so the stack stays compact even on heavy async wrappers. Suppressed runs emit one summary line so users know the trim happened. 3. Wrapper-route noise. The nextjs-turbopack workbench's start route was catching `WorkflowRunFailedError` rejection on `Promise.race([readLoop(), run.returnValue])` and re-logging it via `console.error('Error in workflow stream:', error)` plus `controller.error(error)` — which then triggered Next.js's `⨯ failed to pipe response` overlay. The SDK already logs the failure cleanly upstream and the runId is on the response header, so the wrapper now closes the SSE stream cleanly on WorkflowRunFailedError. 4. Consistent framed `╰▶ hint:` / `╰▶ docs:` layout for all errors that carry a hint or docs slug. WorkflowError, SerializationError, and WorkflowBuildError now share one `appendFramedDetails` helper matching the box-drawing structure that ContextViolationError already used. Was: blank-line-separated `Learn more: <url>`. Now: one tree, indistinguishable from context-violation rendering. 5. Drop the duplicate logger-side `hint` field. Hints now live on the error message only — actionable hints get serialized into the event log, rehydrated on the workflow side, and shown in observability automatically. The previous logger-only hint duplicated stderr but never made it past the step boundary. Updated SerializationError hint to point at the foundations doc ("Ensure you're returning workflow serializable types. Check the serialization docs to see what's serializable: https://workflow-sdk.dev/docs/foundations/serialization") instead of the hardcoded `(plain objects, arrays, primitives, …)` list, which drifted out of sync as the supported types grew. Same hint reuses for step args, workflow args/return, stream messages, and any other site that goes through `formatSerializationError`. Also retitled the retry summary `3 retries` → `3 max retries` since "3 retries" next to "4 attempts" was ambiguous (already-happened vs. budget).

- ErrorStackBlock (web observability): show just the first non-empty trimmed line of the error message in the card title with single-line truncation. Multi-line messages (`Failed to serialize step return value\n╰▶ hint: …`) were rendering the entire framed body in the title, pushing the copy button off-screen and burying the scannability of the headline. Full message stays in the body via the stack (V8 prepends `Name: message` to `Error.stack`), so no information is lost; hover-tooltip exposes the full title text. - Persisted error message: drop the `Step "step//./.../foo"` machine name from `Step failed after N retries: …` and `Step exceeded max retries (…)` strings. Observability already attributes the event to a specific step via the UI tree, and the CLI logger emits the friendly `Step foo (./...) hit max retries` framing on its own line. Embedding the raw `step//./...` machine name in the persisted message text was duplicate noise.

Co-authored-by: Peter Wielander <mittgfu@gmail.com> Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>

…lier-errors-followups * origin-https/main: [workbench] Add TanStack Start workbench and tests (#1875) Atomically dedupe duplicate step_created/wait_created events in world-local (#1877) Split tarball hosting out of docs into its own project (#1893) Replace fixed-sleep hook waits with event-driven waitForHook helper (#1879) # Conflicts: # pnpm-lock.yaml

…wups' into pranaygp/friendlier-errors-followups * origin/pranaygp/friendlier-errors-followups: Update .changeset/pretty-log-format.md Update .changeset/friendlier-errors.md

The class no longer attaches a slug-based `╰▶ docs:` line — the foundations URL is embedded directly in the hint via the `formatSerializationError` helper in @workflow/core. Update the test expectations accordingly: - bare-title case is now a single line (no docs link) - hint case renders one `╰▶ hint: …` branch (no second branch)

Four `should throw error for an unsupported type` cases were still asserting on the old hardcoded type list. Update to the new hint phrasing that points at the foundations doc, matching the change in `formatSerializationError` (`packages/core/src/serialization/errors.ts`).

pranaygp · 2026-05-04T03:53:05Z

+                  '`crypto.subtle.generateKey()` is not available inside a workflow function. Move key generation to a step function where full Node.js crypto is available.'
+                );


Let's keep this as a not implemented error. generateKey is a web crypto thing I think, not a nodejs function

pranaygp · 2026-05-04T03:58:05Z

+  const firstLine =
+    message.split('\n').find((line) => line.trim().length > 0) ?? message;
+  return firstLine.trim();
+}


even if it's just 1 line, we should have a max character length or something so that we don't render very long error messages in the title, and just leave the body to have the full message

TooTallNate

Withdrawing my prior REQUEST_CHANGES — both blockers are addressed and the related polish on top is welcome.

What's resolved

pr-artifacts/ removed (351971ac08). The 6 files plus README are gone, and the changesets were consolidated into a single friendlier-errors.md + pretty-log-format.md pair. ✓
Regression test for fatal-vs-retryable step routing added (9d45cdf6ec, step-handler.test.ts:779-887). Three new tests cover exactly the gap I flagged:
- emits exactly one step_failed and does not re-queue when the step throws an error with fatal=true — uses a user-defined class FatalUserError extends Error { readonly fatal = true }. This is precisely the case where a third-party error with fatal: true should be treated as fatal — explicitly tested end-to-end through the handler's retry loop.
- schedules a retry (and does not fail the step) on the first attempt of a non-fatal Error — positive case for the retry path.
- emits step_failed once the non-fatal retry budget is exhausted — terminal case after maxRetries.
All three pass on the PR branch when running the full file (820 core tests green).
util.inspect dedup fix in context-violation-error.ts (lines 113-127) — the original slice(1) only dropped the first line of .stack, so multi-line .message ended up duplicated when util.inspect() ran. Now slices past messageLineCount lines. There's a regression test in context-errors.test.ts:139-156 that asserts the framed ╰▶ docs: line appears exactly once — solid catch from review.
cause leak fix in WorkflowError (errors/src/index.ts:79-87) — this.cause = options?.cause was unconditionally setting the property, making it an enumerable own property even when undefined. The new conditional avoids polluting util.inspect(err) output with { cause: undefined, ... } on every no-cause subclass. Subtle but correct.

Reframing on `FatalError.is()` widening

Withdrawing my prior concern about the changeset bump type. Confirmed: this is beta-v5-only and not backported to stable, so per AGENTS.md the patch bump is fine — bump type is cosmetic on main in pre-release mode and only matters for backport-to-stable, which doesn't apply here.

The semantic widening makes more sense in context: with the upcoming custom-class serialization work in #1851 routing thrown values through the serialization pipeline, any class with fatal: true opts cleanly into the no-retry path without inheritance gymnastics. That's a clean architecture story and explicitly tested by the new regression case. Approving the design.

Non-blocking observations remaining

These were all flagged previously as non-blocking and have not been addressed. Not gating but worth tracking:

6 stepLogger.* callsites still bypass the scoped stepRuntimeLogger (lines 374, 594, 651, 717, 774, 788 in current step-handler.ts). The stepLogger namespace is intentional for debug:workflow:step:* filtering, so the choice is defensible — but it means the IDs the new forRun factory was designed to centralize get manually re-passed in those 6 places. stepLogger.forRun(...).child({ stepId, stepName }) would close this.
Lowercase [workflow] prefix still in runtime/world.ts:90 and a few Next.js builder files. These are direct console.* calls (not logger callsites), so technically out of scope for the prefix-standardization work, but worth aligning eventually.
ANSI bytes can leak into stderr drains through formatLogMetadata's Ansi.* calls. The team has explicitly chosen this trade-off (comment in log-format.ts:24 notes "web/web-shared do not consume stderr at all — they read CBOR/JSON event payloads from the World event log"). Acceptable.
Test isolation nit in the new describe block. step-handler fatal vs retryable behavior has no beforeAll to initialize capturedHandlerRef; it relies on the sibling step-handler 409 handling block's beforeAll running first. Means the new tests fail when run in isolation (vitest -t "fatal vs retryable"). Easy to fix by duplicating the beforeAll, but only relevant to anyone reordering or pruning the file later.

Wrap-up

Big PR, well-executed end-to-end. The structured FramedContent/renderPlain/renderPretty split, the chalk subpath, the consolidated changeset, the regression tests added in response to review — all good craftsmanship. Ship.

…lier-errors-followups * origin-https/main: Fix pnpm type issue after tanstack PR (#1907) ci: pass stale-banner via path: to sticky-pull-request-comment in tests + benchmarks workflows (#1887)

pranaygp and others added 13 commits April 22, 2026 18:28

Use double-quoted changeset frontmatter per repo convention

b2b1587

Use double-quoted changeset frontmatter per repo convention

351d330

Use double-quoted changeset frontmatter per repo convention

d8b41c0

Copilot AI review requested due to automatic review settings April 24, 2026 02:08

Copilot started reviewing on behalf of pranaygp April 24, 2026 02:09 View session

vercel Bot deployed to Preview – workflow-web April 24, 2026 02:09 View deployment

vercel Bot deployed to Preview – workbench-nitro-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-hono-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-vite-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-astro-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-fastify-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-sveltekit-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-nuxt-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – workbench-express-workflow April 24, 2026 02:10 View deployment

pranaygp mentioned this pull request Apr 24, 2026

Friendlier context-violation errors + Ansi renderer #1831

Closed

8 tasks

vercel Bot deployed to Preview – example-workflow April 24, 2026 02:10 View deployment

vercel Bot deployed to Preview – example-nextjs-workflow-turbopack April 24, 2026 02:10 View deployment

pranaygp added 2 commits May 1, 2026 10:06

TooTallNate requested changes May 2, 2026

View reviewed changes

VaguelySerious reviewed May 2, 2026

View reviewed changes

pranaygp added 4 commits May 2, 2026 19:34

Merge branch 'main' into pranaygp/friendlier-errors-followups

d900ccd

VaguelySerious reviewed May 2, 2026

View reviewed changes

VaguelySerious approved these changes May 3, 2026

View reviewed changes

Comment thread .changeset/friendlier-errors.md Outdated

Comment thread .changeset/pretty-log-format.md Outdated

pranaygp and others added 8 commits May 4, 2026 10:47

Update .changeset/friendlier-errors.md

eba998f

Co-authored-by: Peter Wielander <mittgfu@gmail.com> Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>

Update .changeset/pretty-log-format.md

5230a82

Co-authored-by: Peter Wielander <mittgfu@gmail.com> Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>

Merge remote-tracking branch 'origin/pranaygp/friendlier-errors-follo…

7181f17

…wups' into pranaygp/friendlier-errors-followups * origin/pranaygp/friendlier-errors-followups: Update .changeset/pretty-log-format.md Update .changeset/friendlier-errors.md

pranaygp commented May 4, 2026

View reviewed changes

TooTallNate approved these changes May 4, 2026

View reviewed changes

Merge remote-tracking branch 'origin-https/main' into pranaygp/friend…

5b8d791

…lier-errors-followups * origin-https/main: Fix pnpm type issue after tanstack PR (#1907) ci: pass stale-banner via path: to sticky-pull-request-comment in tests + benchmarks workflows (#1887)

		'`crypto.subtle.generateKey()` is not available inside a workflow function. Move key generation to a step function where full Node.js crypto is available.'
		);

Conversation

pranaygp commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's included

Fixes from manual testing (createHook() inside a step)

Follow-up round (latest commits)

Post-review polish (latest commits)

Addressed review feedback

Manual test plan

1. Context-violation errors (phase 1 + 2 + followups)

2. Runtime logger metadata (phase 3)

3. SerializationError (phase 4)

4. describeError attribution (phase 5)

5. Consistency pass (phase 6)

6. describeError subpath (phase 7 foundation)

7. WorkflowBuildError (phase 8)

8. New @workflow/errors/ansi subpath (final PR)

9. FatalError.is() widening (final PR)

Unit tests

Smoke-test harness (local)

Uh oh!

changeset-bot Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

vercel Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🧪 E2E Test Results

Summary

Details by Category

Uh oh!

TooTallNate left a comment

Choose a reason for hiding this comment

Blocking

pr-artifacts/ directory needs to be removed before merge

Higher-risk concerns

FatalError.is() widening is a runtime-behavior change

Step-handler dehydrate move is correct but undertested

Other observations (non-blocking)

Minor design

Logger / log-format

Module-realm hazards

Minor

Doc

Uh oh!

VaguelySerious left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

AI Review: Blocking

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

VaguelySerious left a comment

Choose a reason for hiding this comment

Uh oh!

pranaygp commented May 3, 2026

Uh oh!

Uh oh!

Uh oh!

pranaygp commented Apr 24, 2026 •

edited

Loading

Fixes from manual testing (`createHook()` inside a step)

3. `SerializationError` (phase 4)

4. `describeError` attribution (phase 5)

6. `describeError` subpath (phase 7 foundation)

7. `WorkflowBuildError` (phase 8)

8. New `@workflow/errors/ansi` subpath (final PR)

9. `FatalError.is()` widening (final PR)

changeset-bot Bot commented Apr 24, 2026 •

edited

Loading

vercel Bot commented Apr 24, 2026 •

edited

Loading

github-actions Bot commented Apr 24, 2026 •

edited

Loading

`pr-artifacts/` directory needs to be removed before merge

`FatalError.is()` widening is a runtime-behavior change

Reframing on `FatalError.is()` widening